Unsupervised statistical clustering of environmental shotgun sequences
Similar resources
Unsupervised Two-Way Clustering of Metagenomic Sequences
A major challenge facing metagenomics is the development of tools for the characterization of functional and taxonomic content of vast amounts of short metagenome reads. The efficacy of clustering methods depends on the number of reads in the dataset, the read length and relative abundances of source genomes in the microbial community. In this paper, we formulate an unsupervised naive Bayes mul...
MetaGene: prokaryotic gene finding from environmental genome shotgun sequences
Exhaustive gene identification is a fundamental goal in all metagenomics projects. However, most metagenomic sequences are unassembled anonymous fragments, and conventional gene-finding methods cannot be applied. We have developed a prokaryotic gene-finding program, MetaGene, which utilizes di-codon frequencies estimated by the GC content of a given sequence with other various measures. MetaGen...
Erratum to “Unsupervised Two-Way Clustering of Metagenomic Sequences”
and Bahrad Sokhansanj, "Metagenome fragment classification using N-mer frequency profiles," Advances in Bioinformatics, Volume 2008 (2008).
Unsupervised view and rate invariant clustering of video sequences
Videos play an ever increasing role in our everyday lives with applications ranging from news, entertainment, scientific research, security and surveillance. Coupled with the fact that cameras and storage media are becoming less expensive, it has resulted in people producing more video content than ever before. This necessitates the development of efficient indexing and retrieval algorithms for...
GibbsCluster: unsupervised clustering and alignment of peptide sequences
Receptor interactions with short linear peptide fragments (ligands) are at the base of many biological signaling processes. Conserved and information-rich amino acid patterns, commonly called sequence motifs, shape and regulate these interactions. Because of the properties of a receptor-ligand system or of the assay used to interrogate it, experimental data often contain multiple sequence motif...
Journal
Journal title: BMC Bioinformatics
Year: 2009
ISSN: 1471-2105
DOI: 10.1186/1471-2105-10-316